Robust speech recognition in car environments

نویسندگان

  • Makoto Shozakai
  • Satoshi Nakamura
  • Kiyohiro Shikano
چکیده

A user-friendly speech interface in a car cabin is highly needed for safety reasons. This paper will describe a robust speech recognition method that can cope with additive noises and multiplicative distortions. A known additive noise, a source signal of which is available, might be canceled by NLMSVAD(Normalized Least Mean Squares with frame-wise Voice Activity Detection). On the other hand, an unknown additive noise, a source signal of which is not available, is suppressed with CSS(Continuous Spectral Subtraction). Furthermore, various multiplicative distortions are simultaneously compensated with ECMN(Exact Cepstrum Mean Normalization) which is speakerdependent/environment-dependent CMN for speech/non-speech. Evaluation results of the proposed method for car cabin environments are finally described.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Auditory-based Acoustic Distinctive Features and Spectral Cues for Robust Automatic Speech Recognition in Low-SNR Car Environments

In this paper, a multi-stream paradigm is proposed to improve the performance of automatic speech recognition (ASR) systems in the presence of highly interfering car noise. It was found that combining the classical MFCCs with some auditory-based acoustic distinctive cues and the main formant frequencies of a speech signal using a multi-stream paradigm leads to an improvement in the recognition ...

متن کامل

Robust automatic speech recognition for accented Mandarin in car environments

This paper addresses the issues of robust automatic speech recognition (ASR) for accented Mandarin in car environments. A robust front-end is proposed, which adopts a Minimum Mean-Square Error (MMSE) estimator to suppress the background noise in frequency domain, and then implements spectrum smoothing both in time and frequency index to compensate those spectrum components distorted by the nois...

متن کامل

Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for robust automatic speech recognition in low-SNR car environments

This paper presents an evaluation of the use of some auditorybased distinctive features and formant cues for robust automatic speech recognition (ASR) in the presence of highly interfering car noise. Comparative experiments have indicated that combining the classical MFCCs with some auditory-based acoustic distinctive cues and either the main formant magnitudes or the formant frequencies of a s...

متن کامل

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...

متن کامل

Improving the performance of MFCC for Persian robust speech recognition

The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998